14 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
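
For readers unfamiliar with the Okapi Best Matching 25 baseline named in the abstract above, the following is a minimal sketch of BM25 scoring, not the consortium's implementation. The function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, the non-negative IDF variant, and the toy token lists are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 and b are conventional default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # A common IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical seed "article" and candidate titles, already tokenized.
seed = "enhancer copy number pancreatic cancer".split()
candidates = [
    "copy number variation of enhancer elements in pancreatic cancer".split(),
    "thermal stability of an abc atpase".split(),
    "gene expression in pancreatic tissue".split(),
]
scores = bm25_scores(seed, candidates)
```

Ranking candidates by descending score gives a recommendation list for the seed; a candidate that shares no terms with the seed scores zero.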

    Predicting virulence factors of immunological interest

    No full text
    This article does not have an abstract

    Intrinsic contributions of polar amino acid residues toward thermal stability of an ABC–ATPase of mesophilic origin

    The nucleotide-binding subunit of phosphate-specific transporter (PstB) from mesophilic bacterium, Mycobacterium tuberculosis, is a unique ATP-binding cassette (ABC) ATPase because of its unusual ability to hydrolyze ATP at high temperature. In an attempt to define the basis of thermostability, we took a theoretical approach and compared amino acid composition of this protein to that of other PstBs from available bacterial genomes. Interestingly, based on the content of polar amino acids, this protein clustered with the thermophiles

    In Silico Analysis of Gene Expression Change Associated with Copy Number of Enhancers in Pancreatic Adenocarcinoma

    No full text
    Understanding the gene regulatory network governing cancer initiation and progression is necessary, although it remains largely unexplored. Enhancer elements represent the center of this regulatory circuit. The study aims to identify the gene expression change driven by copy number variation in enhancer elements of pancreatic adenocarcinoma (PAAD). The pancreatic tissue specific enhancer and target gene data were taken from EnhancerAtlas. The gene expression and copy number data were taken from The Cancer Genome Atlas (TCGA). Differentially expressed genes (DEGs) and copy number variations (CNVs) were identified between matched tumor-normal samples of PAAD. Significant CNVs were matched onto enhancer coordinates by using genomic intersection functionality from BEDTools. By combining the gene expression and CNV data, we identified 169 genes whose expression shows a positive correlation with the CNV of enhancers. We further identified 16 genes which are regulated by a super enhancer and 15 genes which have high prognostic potential (Z-score > 1.96). Cox proportional hazard analysis of these genes indicates that these are better predictors of survival. Taken together, our integrative analytical approach identifies enhancer CNV-driven gene expression change in PAAD, which could lead to better understanding of PAAD pathogenesis and to the design of enhancer-based cancer treatment strategies
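
The step in the abstract above that matches CNVs onto enhancer coordinates relies on genomic interval intersection. The sketch below shows the underlying overlap test in plain Python rather than calling BEDTools; it assumes BED-style half-open coordinates, and the function names, gene labels and all numbers are hypothetical, not data from the study.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def intersect(cnvs, enhancers):
    """Pair each CNV segment value with the target gene of every enhancer it overlaps.

    cnvs:      (chrom, start, end, segment_value) tuples
    enhancers: (chrom, start, end, target_gene) tuples
    """
    hits = []
    for c_chrom, c_start, c_end, c_value in cnvs:
        for e_chrom, e_start, e_end, e_gene in enhancers:
            if c_chrom == e_chrom and overlaps(c_start, c_end, e_start, e_end):
                hits.append((e_gene, c_value))
    return hits


# Hypothetical toy coordinates for illustration only.
cnvs = [("chr1", 100, 500, 0.8), ("chr2", 50, 60, -0.4)]
enhancers = [
    ("chr1", 450, 700, "GENE_A"),  # overlaps the chr1 CNV
    ("chr1", 500, 600, "GENE_B"),  # only touches it at 500, so no overlap
    ("chr2", 10, 55, "GENE_C"),    # overlaps the chr2 CNV
]
hits = intersect(cnvs, enhancers)
```

The nested loop is quadratic and is meant only to make the logic visible; dedicated tools use sorted or indexed intervals to handle genome-scale inputs.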

    Not Available

    No full text
    The domestic cow, Bos taurus, is one of the important species selected by humans for various traits, viz. milk yield, meat quality, draft ability, and resistance to disease and pests, as well as for social and religious reasons. Since cattle were domesticated in the Neolithic (8,000-10,000 years ago), the population has reached 1.5 billion, and it is likely to reach 2.6 billion by 2050. The sheer numbers, breed management, market demand for traceability of breed products, conservation prioritization and IPR issues arising from germplasm flow/exchange have created a critical need for accurate and rapid breed identification. Defined breed descriptors have long been used to identify breeds, but because phenotypic descriptions are lacking, especially for ova, semen, embryos and breed products, a molecular approach is indispensable. A molecular approach is also imperative for assessing the degree of admixture and for characterizing non-descript animals. To date, breed identification methods based on molecular data analysis have had major limitations, such as the lack of available reference data and the need for computational expertise. To overcome these challenges, we developed a web server that maintains reference data and provides a facility for breed identification. The reference data used to develop the prediction model were obtained from 8 cattle breeds and 18 microsatellite DNA markers, yielding 18000 allele data. In this study, various algorithms were used to reduce the number of loci or to identify important loci. Minimization to 5 loci was achieved using a memory-based learning algorithm without compromising the accuracy of 95%. This model and methodology can play an immense role in breed identification and conservation programmes for all domestic animal species across the globe. It can also be modelled for flora and fauna in general to identify their respective variety or breed, as needed in germplasm management.
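
The abstract above does not say which memory-based learning variant was used. As a rough illustration only, the sketch below assumes a k-nearest-neighbour classifier over microsatellite genotypes, with distance defined as the number of loci whose allele pairs differ. The breed labels, allele sizes, distance measure and `k=3` are all hypothetical.

```python
from collections import Counter


def mismatch_distance(a, b):
    """Count loci at which two genotypes differ, ignoring allele order within a locus."""
    return sum(1 for x, y in zip(a, b) if tuple(sorted(x)) != tuple(sorted(y)))


def predict_breed(sample, reference, k=3):
    """Majority vote among the k stored genotypes closest to `sample`."""
    nearest = sorted(reference, key=lambda rec: mismatch_distance(sample, rec[0]))[:k]
    votes = Counter(breed for _, breed in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical reference panel: two loci, allele sizes in base pairs.
reference = [
    (((120, 124), (200, 204)), "breed_A"),
    (((120, 124), (200, 200)), "breed_A"),
    (((130, 134), (210, 214)), "breed_B"),
    (((130, 130), (210, 214)), "breed_B"),
]
sample = ((124, 120), (200, 204))
predicted = predict_breed(sample, reference)
```

In a memory-based approach the reference genotypes themselves serve as the model, so reducing the number of loci amounts to checking whether predictions stay accurate when only a subset of positions is compared.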